{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true,
    "pycharm": {
     "name": "#%% md\n"
    }
   },
   "source": [
    "# C10. 开箱即用"
   ]
  },
  {
   "source": [
    "## 10.1 模块\n",
    "\n"
   ],
   "cell_type": "markdown",
   "metadata": {}
  },
  {
   "source": [
    "### 10.1.1 模块概述\n",
    "\n",
    "模块就是程序。任何 Python 程序都可以作为模块导入。"
   ],
   "cell_type": "markdown",
   "metadata": {}
  },
  {
   "source": [
    "### 10.1.2 模块定义\n",
    "\n",
    "1.  在模块中定义函数可以重用代码。\n",
    "2.  在模块中添加测试代码用于检查代码是否预期。"
   ],
   "cell_type": "markdown",
   "metadata": {}
  },
  {
   "source": [
    "### 10.1.3 模块使用\n",
    "\n",
    "1.  将模块放在正确的位置，搜索目录在 `sys.path` 中\n",
    "2.  告诉解释器到哪里去查找，在模块所在的目录包含在环境变量 PYTHONPATH 中"
   ],
   "cell_type": "markdown",
   "metadata": {}
  },
  {
   "source": [
    "### 10.1.4 包\n",
    "\n",
    "为了组织模块，可以将其编组为包(package)。如果目录中包含文件 `__init__.py` 就会被当作包，包的内容在 `__init__.py` 文件中描述。"
   ],
   "cell_type": "markdown",
   "metadata": {}
  },
  {
   "source": [
    "## 10.2 控制模块"
   ],
   "cell_type": "markdown",
   "metadata": {}
  },
  {
   "source": [
    "### 10.2.1 模块结构"
   ],
   "cell_type": "markdown",
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "['Error', 'copy', 'deepcopy', 'dispatch_table', 'error']"
      ]
     },
     "metadata": {},
     "execution_count": 1
    }
   ],
   "source": [
    "# 使用函数 `dir` 列出对象的所有属性\n",
    "import copy\n",
    "[n for n in dir(copy) if not n.startswith('_')]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "['Error', 'copy', 'deepcopy']"
      ]
     },
     "metadata": {},
     "execution_count": 2
    }
   ],
   "source": [
    "# 使用变量 `__all__` 查询模块的公有接口\n",
    "from copy import *  # 就可以导入模块的公有接口中的所有函数\n",
    "copy.__all__"
   ]
  },
  {
   "source": [
    "### 10.2.2 模块帮助"
   ],
   "cell_type": "markdown",
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "output_type": "stream",
     "name": "stdout",
     "text": [
      "Help on function copy in module copy:\n\ncopy(x)\n    Shallow copy operation on arbitrary Python objects.\n    \n    See the module's __doc__ string for more info.\n\n"
     ]
    }
   ],
   "source": [
    "help(copy.copy)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "output_type": "stream",
     "name": "stdout",
     "text": [
      "Shallow copy operation on arbitrary Python objects.\n\n    See the module's __doc__ string for more info.\n    \n"
     ]
    }
   ],
   "source": [
    "print(copy.copy.__doc__)"
   ]
  },
  {
   "source": [
    "### 10.2.3 模块文档"
   ],
   "cell_type": "markdown",
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "output_type": "stream",
     "name": "stdout",
     "text": [
      "Generic (shallow and deep) copying operations.\n\nInterface summary:\n\n        import copy\n\n        x = copy.copy(y)        # make a shallow copy of y\n        x = copy.deepcopy(y)    # make a deep copy of y\n\nFor module specific errors, copy.Error is raised.\n\nThe difference between shallow and deep copying is only relevant for\ncompound objects (objects that contain other objects, like lists or\nclass instances).\n\n- A shallow copy constructs a new compound object and then (to the\n  extent possible) inserts *the same objects* into it that the\n  original contains.\n\n- A deep copy constructs a new compound object and then, recursively,\n  inserts *copies* into it of the objects found in the original.\n\nTwo problems often exist with deep copy operations that don't exist\nwith shallow copy operations:\n\n a) recursive objects (compound objects that, directly or indirectly,\n    contain a reference to themselves) may cause a recursive loop\n\n b) because deep copy copies *everything* it may copy too much, e.g.\n    administrative data structures that should be shared even between\n    copies\n\nPython's deep copy operation avoids these problems by:\n\n a) keeping a table of objects already copied during the current\n    copying pass\n\n b) letting user-defined classes override the copying operation or the\n    set of components copied\n\nThis version does not copy types like module, class, function, method,\nnor stack trace, stack frame, nor file, socket, window, nor array, nor\nany similar types.\n\nClasses can use the same interfaces to control copying that they use\nto control pickling: they can define methods called __getinitargs__(),\n__getstate__() and __setstate__().  See the documentation for module\n\"pickle\" for information on these methods.\n\n"
     ]
    }
   ],
   "source": [
    "print(copy.__doc__)"
   ]
  },
  {
   "source": [
    "### 10.2.4 模块代码文件"
   ],
   "cell_type": "markdown",
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "output_type": "stream",
     "name": "stdout",
     "text": [
      "C:\\ProgramData\\Anaconda3\\envs\\Begin\\lib\\copy.py\n"
     ]
    }
   ],
   "source": [
    "print(copy.__file__)"
   ]
  },
  {
   "source": [
    "## 10.3 标准库"
   ],
   "cell_type": "markdown",
   "metadata": {}
  },
  {
   "source": [
    "### 10.3.1 sys\n",
    "\n",
    "访问与 Python 解释器相关的变量和函数。\n",
    "-   `arg`：命令行参数，包括脚本名\n",
    "-   `exit([arg])`：退出当前程序，通过可选参数指定返回值或者错误消息\n",
    "-   `modules`：字典类型，将模块名称映射到加载的模块\n",
    "-   `path`：列表类型，包含查找模块需要的目录名称\n",
    "-   `platform`：平台标识符，例如：win32 或者 sunos5\n",
    "-   `stdin`：标准输入流——类似于文件的对象\n",
    "-   `stdout`：标准输出流——类似于文件的对象\n",
    "-   `stderr`：标准错误流——类似于文件的对象"
   ],
   "cell_type": "markdown",
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "output_type": "stream",
     "name": "stdout",
     "text": [
      "--f=C:\\Users\\ADMINI~1\\AppData\\Local\\Temp\\tmp-5556M6E2laOmip2i.json --iopub=9034 --transport=\"tcp\" --shell=9032 --Session.key=b\"3fc81e7c-ae5f-451d-b7dd-49c4f87b9510\" --Session.signature_scheme=\"hmac-sha256\" --hb=9030 --control=9031 --stdin=9033 --ip=127.0.0.1\n"
     ]
    }
   ],
   "source": [
    "# 反转并且打印命令行参数\n",
    "import sys\n",
    "args=sys.argv[1:]\n",
    "args.reverse()\n",
    "print(' '.join(args))"
   ]
  },
  {
   "source": [
    "### 10.3.2 os\n",
    "\n",
    "访问多个操作系统的服务。\n",
    "-   `environ`：包含环境变量的映射\n",
    "-   `system(command)`：在子 shell 中执行操作系统的命令\n",
    "-   `sep`：路径中使用的分隔符\n",
    "-   `pathsep`：分隔不同路径的分隔符\n",
    "-   `linesep`：行分隔符(例如：`\\n`, `\\r`, `\\r\\n`)\n",
    "-   `urandom(n)`：返回 n 个字节的强加密随机数据"
   ],
   "cell_type": "markdown",
   "metadata": {}
  },
  {
   "source": [
    "### 10.3.3 fileinput\n",
    "\n",
    "读写文件模块\n",
    "-   `input([files[, inplace[, backup]]])`：帮助迭代多个输入流中的行\n",
    "-   `filename()`：返回当前文件的名称\n",
    "-   `lineno()`：返回(累计的)当前行号\n",
    "-   `filelinieno()`：返回在当前文件中的行号\n",
    "-   `isfirstline()`：检查当前行是否是文件中的第一行\n",
    "-   `isstdin()`：橙果最后一行是否来自 `sys.stdin`\n",
    "-   `nextfile()`：关闭当前文件并且移到下一个文件\n",
    "-   `close()`：关闭序列"
   ],
   "cell_type": "markdown",
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "output_type": "stream",
     "name": "stdout",
     "text": "不能执行"
    }
   ],
   "source": [
    "# 在 Python 脚本中添加行号\n",
    "# 需要在 Python 命令行环境下运行\n",
    "import fileinput\n",
    "\n",
    "for line in fileinput.input(inplace=True):\n",
    "    line=line.rstrip()\n",
    "    num=fileinput.lineno()\n",
    "    print('{:<50} # {:2d}'.format(line,num))"
   ]
  },
  {
   "source": [
    "### 10.3.4 集合(set)\n",
    "\n",
    "集合是由模块 sets 中的 Set 类实现的。"
   ],
   "cell_type": "markdown",
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "output_type": "stream",
     "name": "stdout",
     "text": [
      "{0, 1, 2, 3, 4, 5, 6, 7, 8, 9}\n<class 'dict'>\n<class 'set'>\n{1, 2, 3}\n{'foe', 'fee', 'fie'}\n"
     ]
    }
   ],
   "source": [
    "print(set(range(10)))\n",
    "# 空的花括号创建的是字典\n",
    "print(type({}))\n",
    "print(type({1,2,3}))\n",
    "\n",
    "# 集合会自动忽略重复的元素\n",
    "print({1,2,3,1})\n",
    "\n",
    "# 集合中元素没有确定的排列顺序\n",
    "print({'fee','fie','foe'})"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [
    {
     "output_type": "stream",
     "name": "stdout",
     "text": [
      "并集\n{1, 2, 3, 4}\n{1, 2, 3, 4}\n交集\n{2, 3}\n{2, 3}\n子集\nTrue\nTrue\nFalse\n差集\n{1}\n对称差集\n{1, 4}\n{1, 4}\n复制\n{1, 2, 3}\nTrue\nFalse\n"
     ]
    }
   ],
   "source": [
    "# 集合的运算\n",
    "a={1,2,3}\n",
    "b={2,3,4}\n",
    "print(\"并集\")\n",
    "print(a.union(b))\n",
    "print(a|b)\n",
    "print(\"交集\")\n",
    "print(a.intersection(b))\n",
    "print(a&b)\n",
    "print(\"子集\")\n",
    "c=a&b\n",
    "print(c.issubset(a))\n",
    "print(c<=a)\n",
    "print(c>=a)\n",
    "print(\"差集\")\n",
    "print(a-b)\n",
    "print(\"对称差集\")\n",
    "print(a.symmetric_difference(b))\n",
    "print(a^b)\n",
    "print(\"复制\")\n",
    "print(a.copy())\n",
    "print(a.copy()==a)\n",
    "print(a.copy() is a)"
   ]
  },
  {
   "source": [
    "### 10.3.4 堆(heap)\n",
    "\n",
    "是一种优先队列。能以任何顺序添加对象，并且随时找出最小的元素进行处理。\n",
    "-   `heappush(heap,x)`：将 x 压入堆中\n",
    "-   `heappop(heap)`：从堆中弹出最小的元素\n",
    "-   `heapify(heap)`：让列表具备堆的特征\n",
    "-   `heapreplace(heap,x)`：弹出最小的元素，并且将 x 压入堆中\n",
    "-   `nlargest(n,iter)`：返回 iter 中 n 个最大的元素\n",
    "-   `nsmallest(n,iter)`：返回 iter 中 n 个最小的元素"
   ],
   "cell_type": "markdown",
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "output_type": "stream",
     "name": "stdout",
     "text": [
      "heap= [0, 2, 1, 3, 4, 6, 8, 5, 9, 7]\nheap= [0, 0.5, 1, 3, 2, 6, 8, 5, 9, 7, 4]\n0\n0.5\n1\n"
     ]
    }
   ],
   "source": [
    "from heapq import *\n",
    "from random import shuffle\n",
    "\n",
    "data=list(range(10))\n",
    "shuffle(data)\n",
    "heap=[]\n",
    "for n in data:\n",
    "    heappush(heap,n)\n",
    "print(\"heap=\",heap)\n",
    "heappush(heap,0.5)\n",
    "print(\"heap=\",heap)\n",
    "print(heappop(heap))\n",
    "print(heappop(heap))\n",
    "print(heappop(heap))\n",
    "print(\"heap=\",heap)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "output_type": "stream",
     "name": "stdout",
     "text": [
      "heap= [0, 1, 5, 3, 2, 7, 9, 8, 4, 6]\nheap= [0.5, 1, 5, 3, 2, 7, 9, 8, 4, 6]\nheap= [1, 2, 5, 3, 6, 7, 9, 8, 4, 10]\n"
     ]
    }
   ],
   "source": [
    "heap=[5,8,0,3,6,7,9,1,4,2]\n",
    "heapify(heap)   # 处理后可以发现堆是采用二叉树实现的\n",
    "print(\"heap=\",heap)\n",
    "heapreplace(heap,0.5)\n",
    "print(\"heap=\",heap)\n",
    "heapreplace(heap,10)\n",
    "print(\"heap=\",heap)"
   ]
  },
  {
   "source": [
    "### 10.3.4 双端队列(及其他集合)\n",
    "\n",
    "与集合一样，双端队列也是从可迭代对象创建的。"
   ],
   "cell_type": "markdown",
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "output_type": "stream",
     "name": "stdout",
     "text": [
      "q= deque([0, 1, 2, 3, 4])\nq= deque([0, 1, 2, 3, 4, 5])\nq= deque([6, 0, 1, 2, 3, 4, 5])\nq= deque([6, 0, 1, 2, 3, 4])\nq= deque([0, 1, 2, 3, 4])\nq= deque([2, 3, 4, 0, 1])\nq= deque([3, 4, 0, 1, 2])\n"
     ]
    }
   ],
   "source": [
    "from collections import deque\n",
    "q=deque(range(5))\n",
    "print(\"q=\",q)\n",
    "q.append(5)\n",
    "print(\"q=\",q)\n",
    "q.appendleft(6)\n",
    "print(\"q=\",q)\n",
    "q.pop()\n",
    "print(\"q=\",q)\n",
    "q.popleft()\n",
    "print(\"q=\",q)\n",
    "q.rotate(3)\n",
    "print(\"q=\",q)\n",
    "q.rotate(-1)\n",
    "print(\"q=\",q)\n"
   ]
  },
  {
   "source": [
    "### 10.3.5 time\n",
    "\n",
    "时间和日期处理函数\n",
    "-   `asctime([tuple])`：将时间元组转换为字符串\n",
    "-   `localtime([secs])`：将秒数转换为表示当地时间的日期元组\n",
    "-   `mktime(tuple)`：将时间元组转换为当地时间\n",
    "-   `sleep(secs)`：休眠 secs 秒\n",
    "-   `strptime(string[, format])`：将字符串转换为时间元组\n",
    "-   `time()`：当前时间(从新纪元开始后的秒数，以 UTC 为准)"
   ],
   "cell_type": "markdown",
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [
    {
     "output_type": "stream",
     "name": "stdout",
     "text": [
      "Tue Feb 23 18:45:42 2021\n1614077142.1256776\n"
     ]
    }
   ],
   "source": [
    "import time\n",
    "print(time.asctime())\n",
    "print(time.time())"
   ]
  },
  {
   "source": [
    "### 10.3.6 random\n",
    "\n",
    "生成伪随机数的函数：\n",
    "-   `random()`：返回一个 $(0,1]$ 的随机实数\n",
    "-   `getrandbits(n)`：以长整数方式返回 n 个随机的二进制位\n",
    "-   `uniform(a,b)`：返回一个 $(a,b]$ 的随机实数\n",
    "-   `randrange([start], stop, [step])`：从 range(start, stop, step) 中随机地选择一个数\n",
    "-   `choice(seq)`：从序列 seq 中随机地选择一个元素\n",
    "-   `shuffle(seq[, random])`：就地打乱序列 seq\n",
    "-   `sample(seq, n)`：从序列 seq 中随机地选择 n 个不同的元素"
   ],
   "cell_type": "markdown",
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [
    {
     "output_type": "stream",
     "name": "stdout",
     "text": [
      "time1= 1451577600.0\ntime2= 1483200000.0\nTue Dec  6 06:15:07 2016\n"
     ]
    }
   ],
   "source": [
    "from random import *\n",
    "from time import *\n",
    "\n",
    "date1=(2016,1,1,0,0,0,-1,-1,-1)\n",
    "time1=mktime(date1)\n",
    "\n",
    "date2=(2017,1,1,0,0,0,-1,-1,-1)\n",
    "time2=mktime(date2)\n",
    "\n",
    "random_time=uniform(time1,time2)\n",
    "print(\"time1=\",time1)\n",
    "print(\"time2=\",time2)\n",
    "print(asctime(localtime(random_time)))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [
    {
     "output_type": "stream",
     "name": "stdout",
     "text": [
      "The result is  17\n"
     ]
    }
   ],
   "source": [
    "from random import randrange\n",
    "\n",
    "num=int(input(\"How many dice?\"))\n",
    "sides=int(input(\"How many sides per die?\"))\n",
    "sum=0\n",
    "for i in range(num): sum+=randrange(sides)+1\n",
    "print(\"The result is \",sum)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {},
   "outputs": [
    {
     "output_type": "stream",
     "name": "stdout",
     "text": [
      "seq= [7, 0, 1, 4, 6, 3, 3, 5, 6, 5, 2, 4, 8, 2, 9, 1, 7, 9, 0, 8]\nsample(seq, n)= [4, 7, 8, 0, 9, 7, 5, 1, 1, 2]\n"
     ]
    }
   ],
   "source": [
    "from random import *\n",
    "\n",
    "seq=list(range(10))\n",
    "seq=seq+seq\n",
    "shuffle(seq)\n",
    "print(\"seq=\",seq)\n",
    "print(\"sample(seq, n)=\", sample(seq, 10))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {},
   "outputs": [
    {
     "output_type": "stream",
     "name": "stdout",
     "text": [
      "['1 of diamonds', '1 of clubs', '1 of hearts', '1 of spades', '2 of diamonds', '2 of clubs', '2 of hearts', '2 of spades', '3 of diamonds', '3 of clubs', '3 of hearts', '3 of spades']\n['4 of clubs', '10 of hearts', '7 of spades', '5 of clubs', 'Queen of clubs', 'Queen of diamonds', '1 of clubs', '5 of diamonds', '7 of hearts', '10 of clubs', 'King of clubs', '7 of clubs']\n"
     ]
    }
   ],
   "source": [
    "values=list(range(1,11))+'Jack Queen King'.split()\n",
    "suits='diamonds clubs hearts spades'.split()\n",
    "deck=['{} of {}'.format(v,s) for v in values for s in suits]\n",
    "print(deck[:12])\n",
    "shuffle(deck)\n",
    "print(deck[:12])"
   ]
  },
  {
   "source": [
    "### 10.3.7 shelve & json"
   ],
   "cell_type": "markdown",
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {},
   "outputs": [
    {
     "output_type": "stream",
     "name": "stdout",
     "text": [
      "['a', 'b', 'c']\n['a', 'b', 'c', 'd']\n"
     ]
    }
   ],
   "source": [
    "import shelve\n",
    "\n",
    "s=shelve.open('test.dat')\n",
    "s['x']=['a','b','c']\n",
    "s['x'].append('d')\n",
    "# d 会丢失\n",
    "print(s['x'])\n",
    "\n",
    "temp=s['x']\n",
    "temp.append('d')\n",
    "s['x']=temp\n",
    "print(s['x'])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def store_person(db):\n",
    "    pid=input(\"Enter unique ID number:\")\n",
    "    person={}\n",
    "    person['name']=input(\"Enter name:\")\n",
    "    person['age']=input(\"Enter age:\")\n",
    "    person['phone']=input(\"Enter phone number:\")\n",
    "    db[pid]=person\n",
    "\n",
    "def lookup_person(db):\n",
    "    pid=input(\"Enter ID number:\")\n",
    "    field=input(\"What would you like to know? (name, age, phone)\")\n",
    "    field=field.strip().lower()\n",
    "    print(field.captitalize()+':', db[pid][field])\n",
    "\n",
    "def print_help():\n",
    "    print(\"The available commands are:\")\n",
    "    print(\"store: Stores information about a person\")\n",
    "    print(\"lookup: Looks up a person from ID number\")\n",
    "    print(\"quit: Save changes and exit\")\n",
    "    print(\"?: Prints this message\")\n",
    "\n",
    "def enter_command():\n",
    "    cmd=input(\"Enter command (? for help):\")\n",
    "    cmd=cmd.strip().lower()\n",
    "    return cmd\n",
    "\n",
    "def main():\n",
    "    database=shelve.open(\"database.dat\")\n",
    "    try:\n",
    "        while True:\n",
    "            cmd=enter_command()\n",
    "            if cmd=='store':\n",
    "                store_person(database)\n",
    "            elif cmd=='lookup':\n",
    "                lookup_person(database)\n",
    "            elif cmd=='?':\n",
    "                print_help()\n",
    "            elif cmd=='quit':\n",
    "                return\n",
    "    finally:\n",
    "        database.close()"
   ]
  },
  {
   "source": [
    "### 10.3.8 re\n",
    "\n",
    "正则表达式是什么？正则表达式是可匹配文本片段的模式。\n",
    "-   通配符：正则表达式可与多个字符串匹配，可以使用特殊字符来创建正则表达式\n",
    "    -   句点与除换行符外的任何字符都匹配，因此称为通配符(wildcard)\n",
    "-   对特殊字符进行转义：要让特殊字符的行为与普通字符一样，需要对其进行转义\n",
    "-   字符集：匹配任何字符都可以使用。使用方括号将一个子串括起，创建一个字符集。这样的字符集与其包含的字符都匹配。\n",
    "-   二选一和子模式：管道字符(|)表示二选一；单个字符称为子模式\n",
    "-   可选模式和重复模式：在子模式后面加上问号，可将其指定为可选的，即可以包含也可以不包含。\n",
    "    -   `(pattern)*`：pattern 可以重复 0 次、1 次 或者 多次\n",
    "    -   `(pattern)+`：pattern 可以重复 1 次 或者 多次\n",
    "    -   `(pattern){m,n}`：pattern 可以重复 m~n 次\n",
    "-   字符串的开头和末尾：脱字符(^)指定字符串的开头；美元符号($)指定字符串的结尾"
   ],
   "cell_type": "markdown",
   "metadata": {}
  },
  {
   "source": [
    "模块 re 的内容：\n",
    "-   `compile(pattern[, flags])`：根据包含正则表达式的字符串创建模式对象\n",
    "-   `search(pattern, string[, flags])`：在字符串中查找模式\n",
    "-   `match(pattern, string[, flags])`：在字符串开头匹配模式\n",
    "-   `split(pattern, string[, maxsplit=0])`：根据模式来分割字符串\n",
    "-   `findall(pattern, string)`：返回一个列表，其中包含字符串中所有与模式匹配的子串\n",
    "-   `sub(pat, repl, string[, count=0])`：将字符串中与模式 pat 匹配的子串都替换为 repl\n",
    "-   `escape(string)`：对字符串中所有的正则表达式特殊字符都进行转义"
   ],
   "cell_type": "markdown",
   "metadata": {}
  },
  {
   "source": [
    "匹配对象和编组\n",
    "-   `group([group1, ...])`：获取与给定子模式(编组)匹配的子串\n",
    "-   `start([group])`：返回与给定编组匹配的子串的起始位置\n",
    "-   `end([group])`：返回与给定编组匹配的子串的终止位置\n",
    "-   `span([group])`：返回与给定编组匹配的子串的起始和终止位置"
   ],
   "cell_type": "markdown",
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "metadata": {},
   "outputs": [
    {
     "output_type": "stream",
     "name": "stdout",
     "text": [
      "python\n4\n10\n(4, 10)\n"
     ]
    }
   ],
   "source": [
    "import re\n",
    "m=re.match(r'www\\.(.*)\\..{3}','www.python.org')\n",
    "print(m.group(1))\n",
    "print(m.start(1))\n",
    "print(m.end(1))\n",
    "print(m.span(1))"
   ]
  },
  {
   "source": [
    "替换中的组号和函数\n",
    "\n"
   ],
   "cell_type": "markdown",
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "'Hello, <em>world</em>!'"
      ]
     },
     "metadata": {},
     "execution_count": 1
    }
   ],
   "source": [
    "import re\n",
    "# 要让正则表达式容易理解，可以在调用模块 re 中的函数时使用标志 VERBOSE\n",
    "# 标志 VERBOSE 可以在模式中添加空白(空格、制表符、换行符等)，还可以添加注释\n",
    "emphasis_pattern=re.compile(r'''\n",
    "\\*          # 起始突出标志——一个星号\n",
    "(           # 与要突出的内容匹配的编组的起始位置\n",
    "[^\\*]+      # 与除星号外的其他字符都匹配\n",
    ")           # 编组至此结束\n",
    "\\*          # 结束突出标志\n",
    "''',re.VERBOSE)\n",
    "re.sub(emphasis_pattern,r'<em>\\1</em>','Hello, *world*!')"
   ]
  },
  {
   "source": [
    "贪婪模式：重复运算符默认都是贪婪的，意味着将匹配尽可能多的内容。\n",
    "\n",
    "非贪婪模式：在重复运算符后加上问号(?)，将匹配将可能少的内容。"
   ],
   "cell_type": "markdown",
   "metadata": {}
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [
    {
     "output_type": "error",
     "ename": "FileNotFoundError",
     "evalue": "[Errno 2] No such file or directory: '--ip=127.0.0.1'",
     "traceback": [
      "\u001b[1;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[1;31mFileNotFoundError\u001b[0m                         Traceback (most recent call last)",
      "\u001b[1;32m<ipython-input-3-11200d9f6dd3>\u001b[0m in \u001b[0;36m<module>\u001b[1;34m\u001b[0m\n\u001b[0;32m     19\u001b[0m \u001b[1;31m# 获取所有文本并合并成一个字符串\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m     20\u001b[0m \u001b[0mlines\u001b[0m\u001b[1;33m=\u001b[0m\u001b[1;33m[\u001b[0m\u001b[1;33m]\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m---> 21\u001b[1;33m \u001b[1;32mfor\u001b[0m \u001b[0mline\u001b[0m \u001b[1;32min\u001b[0m \u001b[0mfileinput\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0minput\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m     22\u001b[0m     \u001b[0mlines\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mappend\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mline\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m     23\u001b[0m \u001b[0mtext\u001b[0m\u001b[1;33m=\u001b[0m\u001b[1;34m''\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mjoin\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mlines\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
      "\u001b[1;32mC:\\ProgramData\\Anaconda3\\envs\\Begin\\lib\\fileinput.py\u001b[0m in \u001b[0;36m__next__\u001b[1;34m(self)\u001b[0m\n\u001b[0;32m    248\u001b[0m     \u001b[1;32mdef\u001b[0m \u001b[0m__next__\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mself\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m    249\u001b[0m         \u001b[1;32mwhile\u001b[0m \u001b[1;32mTrue\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m--> 250\u001b[1;33m             \u001b[0mline\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_readline\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m    251\u001b[0m             \u001b[1;32mif\u001b[0m \u001b[0mline\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m    252\u001b[0m                 \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_filelineno\u001b[0m \u001b[1;33m+=\u001b[0m \u001b[1;36m1\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
      "\u001b[1;32mC:\\ProgramData\\Anaconda3\\envs\\Begin\\lib\\fileinput.py\u001b[0m in \u001b[0;36m_readline\u001b[1;34m(self)\u001b[0m\n\u001b[0;32m    360\u001b[0m                     \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_file\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_openhook\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_filename\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_mode\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m    361\u001b[0m                 \u001b[1;32melse\u001b[0m\u001b[1;33m:\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[1;32m--> 362\u001b[1;33m                     \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_file\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mopen\u001b[0m\u001b[1;33m(\u001b[0m\u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_filename\u001b[0m\u001b[1;33m,\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_mode\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0m\u001b[0;32m    363\u001b[0m         \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_readline\u001b[0m \u001b[1;33m=\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_file\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0mreadline\u001b[0m  \u001b[1;31m# hide FileInput._readline\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n\u001b[0;32m    364\u001b[0m         \u001b[1;32mreturn\u001b[0m \u001b[0mself\u001b[0m\u001b[1;33m.\u001b[0m\u001b[0m_readline\u001b[0m\u001b[1;33m(\u001b[0m\u001b[1;33m)\u001b[0m\u001b[1;33m\u001b[0m\u001b[1;33m\u001b[0m\u001b[0m\n",
      "\u001b[1;31mFileNotFoundError\u001b[0m: [Errno 2] No such file or directory: '--ip=127.0.0.1'"
     ]
    }
   ],
   "source": [
    "# 10-11：模板系统，不能在Notebook中执行\n",
    "import fileinput,re\n",
    "\n",
    "# 与使用方括号括起的字段匹配\n",
    "field_pat=re.compile(r'\\[(.+?)\\]')\n",
    "# 将变量收集到 scope\n",
    "scope={}\n",
    "\n",
    "# 用于调用 re.sub\n",
    "def replacement(match):\n",
    "    code=match.group(1)\n",
    "    # 如果字段为表达式，就返回其结果\n",
    "    try: return str(eval(code,scope))\n",
    "    except SyntaxError:\n",
    "        # 否则在当前作用域内执行这个赋值语句\n",
    "        exec(code,scope)\n",
    "        return ''\n",
    "\n",
    "# 获取所有文本并合并成一个字符串\n",
    "lines=[]\n",
    "for line in fileinput.input():\n",
    "    lines.append(line)\n",
    "text=''.join(lines)\n",
    "print(field_pat.sub(replacement,text))"
   ]
  },
  {
   "source": [
    "### 10.3.9 其他标准模块\n",
    "\n",
    "-   argparse：用于解析命令行的参数\n",
    "-   cmd：编写类似于 Python 交互式解释器的命令行解释器\n",
    "-   csv：CSV 指的是逗号分隔的值(comma-seperated values)，模块能够轻松地读写 CSV 文件\n",
    "-   datetime：支持特殊的日期和时间对象，并且让你能够以各种方式创建和合并这些对象\n",
    "-   difflib：确定两个序列的相似程度\n",
    "-   enum：枚举类型，是一种只有少数几个可能取值的类型\n",
    "-   functools：能够在调用函数时只提供部分参数\n",
    "-   hashlib：计算字符串的小型「签名」(哈希值)\n",
    "-   itertools：用于创建和合并迭代器(或者其他可以迭代的对象)的工具，包括：串接可迭代是，创建返回无限连续整数的迭代器，反复遍历可迭代对象\n",
    "-   logging：将信息写入到日志文件 ，可以管理一个或者多个中央日志，支持多种不同优先级的日志消息\n",
    "-   statistics：进行统计计算\n",
    "-   timeit：测量代码段执行时间的工具\n",
    "-   profile：对代码段的效率进行更全面的分析\n",
    "-   trace：覆盖率分析，用于编写测试代码"
   ],
   "cell_type": "markdown",
   "metadata": {}
  },
  {
   "source": [
    "## 10.4 小结\n",
    "\n",
    "-   模块：是一个子程序，用于定义函数、类和变量\n",
    "-   包：包含其他模块的模块，包是使用包含文件 `__init__.py` 的目录实现的\n",
    "-   探索模块：在交互式解释器中导入模块后，就可以众多不同的方式对其进行探索，其中包括使用 `dir`、查看变量 `__all__` 以及使用函数 `help`\n",
    "-   标准库：Python 自带的模块\n",
    "    -   sys：能够访问多个与 Python 解释器关系紧密的变量和函数\n",
    "    -   os：能够访问多个与 OS 关系紧密的变量和函数\n",
    "    -   fileinput：能够迭代多个文件或者流的内容行\n",
    "    -   sets, heapq, deque：提供了三种数据结构\n",
    "    -   time：获取当前时间、操作时间和日期以及设置它们的格式\n",
    "    -   random：：用于生成随机数，从序列中随机地选择元素，以及打乱列表中元素的函数\n",
    "    -   shelve：用于创建永久性映射，内容存储在使用给定文件名的数据库中\n",
    "    -   re：支持正则表达式"
   ],
   "cell_type": "markdown",
   "metadata": {}
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 2
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython2",
   "version": "3.6.12-final"
  },
  "pycharm": {
   "stem_cell": {
    "cell_type": "raw",
    "source": [],
    "metadata": {
     "collapsed": false
    }
   }
  }
 },
 "nbformat": 4,
 "nbformat_minor": 0
}